Joint Factor Analysis for Speaker Recognition Reinterpreted as Signal Coding Using Overcomplete Dictionaries

Authors

Daniel Garcia-Romero

Carol Y. Espy-Wilson

Abstract

This paper presents a reinterpretation of Joint Factor Analysis (JFA) as a signal approximation methodology, based on ridge regression, using an overcomplete dictionary learned from data. A non-probabilistic perspective of the three fundamental steps in the JFA paradigm based on point estimates is provided. That is, the model training, hyperparameter estimation and scoring stages are equated to signal coding, dictionary learning and similarity computation, respectively. Establishing a connection between these two well-researched areas opens the door to cross-pollination between the fields. As an example, we propose two novel ideas that arise naturally from the non-probabilistic perspective and result in faster hyperparameter estimation and improved scoring. Specifically, the proposed technique for hyperparameter estimation avoids the need for explicit matrix inversions in the M-step of the ML estimation. This allows the use of faster techniques such as Gauss-Seidel iteration or Cholesky factorization for the computation of the posterior means of the factors x, y and z during the E-step. Regarding the scoring, a similarity measure based on a normalized inner product is proposed and shown to outperform the state-of-the-art linear scoring approach commonly used in JFA. Experimental validation of these two novel techniques is presented using closed-set identification and speaker verification experiments over the Switchboard database.
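
The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain (unweighted) ridge regression, whereas JFA would also involve covariance and occupancy weighting, and the names `ridge_code`, `normalized_inner_product`, `D`, `s` and `lam` are hypothetical. The sketch codes a signal over an overcomplete dictionary by solving the regularized normal equations with a Cholesky factorization instead of an explicit inverse, then compares two codes with a normalized inner product.

```python
import numpy as np


def ridge_code(D, s, lam=1.0):
    """Ridge-regression code of signal s over dictionary D (columns are atoms).

    Solves (D^T D + lam * I) a = D^T s through a Cholesky factorization,
    so no explicit matrix inverse is formed.
    """
    n_atoms = D.shape[1]
    A = D.T @ D + lam * np.eye(n_atoms)   # symmetric positive definite for lam > 0
    L = np.linalg.cholesky(A)             # A = L @ L.T, L lower triangular
    # Two solves against the triangular factors. np.linalg.solve is a general
    # solver; a dedicated triangular solver would exploit the structure.
    y = np.linalg.solve(L, D.T @ s)
    return np.linalg.solve(L.T, y)


def normalized_inner_product(u, v):
    """Inner product of u and v divided by the product of their norms."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


# Toy data (assumed for illustration): 20-dimensional signals, 50 atoms,
# so the dictionary is overcomplete.
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
s = rng.standard_normal(20)
s_other = rng.standard_normal(20)

a = ridge_code(D, s, lam=0.5)
b = ridge_code(D, s_other, lam=0.5)
score = normalized_inner_product(a, b)    # value in [-1, 1]
```

Because the dictionary has more atoms than signal dimensions, `D.T @ D` alone is singular; the ridge term `lam * np.eye(n_atoms)` is what makes the system solvable and the Cholesky factorization valid.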

Similar resources

Rate-Distortion Analysis of Sparse Overcomplete Codes

Transform coding is a popular coding strategy with many desirable properties. The performance of a transform coder relies on the compaction of energy in a small number of coefficients in the transform domain. Most transform coders rely on linear transforms with orthogonal dictionaries; however, linear transforms are limited in that they often exploit only some of the signal structure. For more ...

Convolutional Neural Network with Adaptive Windows for Speech Recognition

Although speech recognition systems are widely used and their accuracy continues to improve, a considerable gap remains between their accuracy and human recognition ability. This is partly due to high speaker variability in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...

Highly overcomplete sparse coding

This paper explores sparse coding of natural images in the highly overcomplete regime. We show that as the overcompleteness ratio approaches 10x, new types of dictionary elements emerge beyond the classical Gabor function shape obtained from complete or only modestly overcomplete sparse coding. These more diverse dictionaries allow images to be approximated with lower L1 norm (for a fixed SNR),...

Simultaneous denoising and compression of power system disturbances using sparse representation on overcomplete hybrid dictionaries

This study introduces a novel unified framework for simultaneous denoising and compression of electric power system disturbance signals using sparse signal decomposition and reconstruction on an overcomplete hybrid dictionary (OHD) matrix. In the proposed method, the power quality signal is first decomposed into deterministic sinusoidal components and non-deterministic components using the OHD mat...

Learning Overcomplete Representations

In an overcomplete basis, the number of basis vectors is greater than the dimensionality of the input, and the representation of an input is not a unique combination of basis vectors. Overcomplete representations have been advocated because they have greater robustness in the presence of noise, can be sparser, and can have greater flexibility in matching structure in the data. Overcomplete code...

Publication date: 2010
